So Far (Schematically) yet So Near (Semantically)

Authors

  • Amit P. Sheth
  • Vipul Kashyap
Abstract

Figure 8: Abstraction level incompatibilities and the likely types of semantic proximities. [Figure labels: Abstraction Level Incompatibility; Generalization Conflicts (Semantic Relationship); Aggregation Conflicts (Semantic Relationship).]

In this case there is an inclusion relationship between the two conflicting objects, and hence they may be considered to have a semantic relationship.

semPro(O1, O2) = [...]

7.2 Aggregation Conflicts

These conflicts arise when an aggregation is used in one database to identify a set of entities in another database. Also, a property of the aggregate concept can be an aggregate of the corresponding property of the set of entities.

Example: Consider the aggregation SET-OF, which is used to define a concept in the first database, and the set of entities in another database, as follows:

CONVOY(Id#, AvgWeight, Location)
SHIP(Id#, Weight, Location, Captain)

Thus, CONVOY in the first database is a SET-OF SHIPs in the second database. Also, CONVOY.AvgWeight is the average (an aggregate function) of SHIP.Weight over every ship that is a member of the convoy.

In this case there is a mapping in one direction only, i.e., an element of a set is mapped to the set itself. In the other direction, the mapping is not precise. When the SHIP entity is known, one can identify the CONVOY entity it belongs to, but not vice versa. Hence the two objects might be considered to have a semantic relationship. Thus, the semantic proximity can be defined as follows:

semPro(O1, O2) = [...]

8 Schematic Discrepancies Problem

This class of conflicts was discussed in [DAODT85, KLK91]. It was noted that these conflicts can take place within the same data model and arise when data in one database correspond to metadata of another database. This class of conflicts is similar to that discussed in Section 6 when the conflicts depend on the database state. We now analyze the problem and identify three aspects with the help of an example given in [KLK91].

Example: Consider three stock databases. All contain the closing price for each day of each stock in the stock market. The schemata for the three databases are as follows:

Database DB1: relation r : {(date, stkCode, clsPrice), ...}
Database DB2: relation r : {(date, stk1, stk2, ...), ...}
Database DB3: relation stk1 : {(date, clsPrice), ...}, relation stk2 : {(date, clsPrice), ...}, ...

DB1 consists of a single relation that has a tuple per day per stock with its closing price. DB2 also has a single relation, but with one attribute per stock and one tuple per day, where the value of the attribute is the closing price of the stock. DB3 has, in contrast, one relation per stock that has a tuple per day with its closing price. Let us assume that the stkCode values in DB1 are the names of the attributes in DB2 and the names of the relations in DB3 (e.g., stk1, stk2).

8.1 Data Value Attribute Conflict

This conflict arises when the value of an attribute in one database corresponds to an attribute in another database. Thus this kind of conflict depends on the database state. Referring to the above example, the values of the attribute stkCode in the database DB1 correspond to the attributes stk1, stk2, ... in the database DB2.

Since this conflict is dependent on the database state, the fourth component of the 4-tuple describing the semantic proximity plays an important role. Also, the mappings here are established between a set of attributes ({Oi}) and values in the extension of the other attribute (O2). Thus the two objects may be considered to be meta-semantically equivalent, and their semantic proximity can be defined as follows:

semPro({Oi}, O2) = [...]

where M is a total 1-1 mapping between {Oi} and S2.

8.2 Attribute Entity Conflict

This conflict arises when the same entity is modeled as an attribute in one database and as a relation in another database. This kind of conflict is different from the conflicts defined in the previous and next subsections because it depends on the database schema and not on the database state. This conflict can also be classified as a subclass of the Entity Definition Incompatibility Problem. Referring to the example described at the beginning of this section, the attributes stk1, stk2 in the database DB2 correspond to relations of the same name in the database DB3.

Figure 9: Schematic Discrepancies and the likely types of semantic proximities. [Figure labels: Schematic Discrepancies; Data Value Attribute Conflict (Meta-Semantic Equivalence); Attribute Entity Conflict (Semantic Equivalence); Data Value Entity Conflict (Meta-Semantic Equivalence).]

Objects O1 and O2 can be considered to be semantically equivalent, as 1-1 value mappings can be established between the domain of the attribute (O1) and the domain of the identifying attribute of the entity (O2). It should be noted that O1 is an attribute (property) and O2 is an entity (object class). Thus the semantic proximity can be defined as follows:

semPro(O1, O2) = [...]

where D1 = Domain(O1) and D2 = Domain(Identifier(O2)).

8.3 Data Value Entity Conflict

This conflict arises when the value of an attribute in one database corresponds to a relation in another database. Thus this kind of conflict depends on the database state. Referring to the example described at the beginning of this section, the values of the attribute stkCode in the database DB1 correspond to the relations stk1, stk2 in the database DB3.

Since this conflict is dependent on the database state, the state component of the semantic proximity plays an important role. Also, the mappings here are established between a set of entities ({Oi}) and values in the extension of an attribute (O2). Thus the two objects may be considered to be meta-semantically equivalent, and their semantic proximity can be defined as follows:

semPro({Oi}, O2) = <ALL, M, (D1, D2), (S1, S2)>

where M is a total 1-1 mapping between {Oi} and S2.
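The three schematic discrepancies of this section can be illustrated with a minimal Python sketch of the stock example. It is not taken from the paper: the sample prices and dates, the function names db1_to_db2, db2_to_db3 and db1_to_db3, and the choice of lists and dictionaries to stand in for relations are all assumptions made for illustration. The mapping M is assumed here to be the identity on names, which is only one possible total 1-1 mapping.

```python
from collections import defaultdict

# DB1 (hypothetical sample data): one relation r(date, stkCode, clsPrice),
# with one tuple per day per stock.
db1_r = [
    ("1992-01-02", "stk1", 10.0),
    ("1992-01-02", "stk2", 20.0),
    ("1992-01-03", "stk1", 11.0),
    ("1992-01-03", "stk2", 19.5),
]


def db1_to_db2(rows):
    """Data value attribute conflict: stkCode values become attribute names."""
    by_date = defaultdict(dict)
    for date, stk_code, cls_price in rows:
        by_date[date][stk_code] = cls_price
    # One tuple (here a dict) per day, one attribute per stock.
    return [{"date": date, **prices} for date, prices in sorted(by_date.items())]


def db2_to_db3(rows):
    """Attribute entity conflict: each stock attribute becomes a relation."""
    relations = defaultdict(list)
    for row in rows:
        for name, cls_price in row.items():
            if name != "date":
                relations[name].append((row["date"], cls_price))
    return dict(relations)


def db1_to_db3(rows):
    """Data value entity conflict: stkCode values become relation names."""
    relations = defaultdict(list)
    for date, stk_code, cls_price in rows:
        relations[stk_code].append((date, cls_price))
    return dict(relations)


db2_r = db1_to_db2(db1_r)
db3 = db2_to_db3(db2_r)

# {Oi}: the stock attributes of DB2; S2: the extension of stkCode in DB1.
attributes = {name for row in db2_r for name in row if name != "date"}
stk_codes = {stk_code for _, stk_code, _ in db1_r}

# M maps every attribute to exactly one stkCode value (identity on names,
# an assumption of this sketch).
M = {name: name for name in attributes}
```

Because the first two conversions depend on the stkCode values actually stored in DB1, they change whenever the database state changes, whereas db2_to_db3 depends only on the schema of DB2. This mirrors the distinction the section draws between state-dependent and schema-dependent conflicts.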
9 Conclusion

An essential prerequisite to achieving interoperability among database systems is to be able to identify relevant data managed by different database systems. This requires us to understand and define the semantic similarities among the objects. We introduced the concept of semantic proximity to specify degrees of semantic similarity among the objects based on their real-world semantics, and used it to propose a semantic taxonomy. We also showed how uncertainty measures can be expressed as a function of these semantic proximities. Modeling of several types of inconsistencies is discussed. Thus we establish uncertainty and inconsistency as aspects of semantics.

Building upon earlier work on schematic (structural, representational) differences among objects, we develop a taxonomy of schematic conflicts. A dual semantic vs. schematic perspective is presented by identifying likely types of semantic similarities between objects with different types of schematic differences.

We are currently developing a uniform formalism to express various schematic conflicts. Additional work is needed to further clarify the nature and structure of the context to which the two objects can belong, as well as the relationship between an object and the context in which the semantic proximity is defined. We also plan to develop a methodology for combining various semantic descriptors. We plan to investigate context-dependent uncertainty functions which map semantic proximities to fuzzy strengths.

Acknowledgements

Peter Fankhauser's comments helped us present the uncertainty function in the proper perspective.

References

[BGMP90] D. Barbara, H. Garcia-Molina, and D. Porter. A Probabilistic Relational Model. Lecture Notes in Computer Science: Advances in Database Technology EDBT '90, #416, 1990.
[BOT86] Y. Breitbart, P. Olson, and G. Thompson. Database Integration in a Distributed Heterogeneous Database System. In Proceedings of the 2nd IEEE Conference on Data Engineering, February 1986.
[CRE87] B. Czejdo, M. Rusinkiewicz, and D. Embley. An approach to Schema Integration and Query Formulation in Federated Database Systems. In Proceedings of the 3rd IEEE Conference on Data Engineering, February 1987.
[DAODT85] S. Deen, R. Amin, G. Ofori-Dwumfuo, and M. Taylor. The architecture of a Generalised Distributed Database System PRECI*. IEEE Computer, 28(4), 1985.
[DeM89] L. DeMichiel. Resolving Database Incompatibility: An approach to performing Relational Operations over Mismatched Domains. IEEE Transactions on Knowledge and Data Engineering, 1(4), 1989.
[DH84] U. Dayal and H. Hwang. View definition and Generalization for Database Integration of a Multidatabase System. IEEE Transactions on Software Engineering, 10(6), November 1984.
[ELN86] R. Elmasri, J. Larson, and S. Navathe. Schema Integration Algorithms for Federated Databases and Logical Database Design. Technical report, Honeywell Corporate Systems Development Division, Golden Valley, MN, 1986.
[EN89] R. Elmasri and S. Navathe. Fundamentals of Database Systems. Benjamin/Cummings, 1989.
[FKN91] P. Fankhauser, M. Kracker, and E. Neuhold. Semantic vs. Structural resemblance of Classes. SIGMOD Record, special issue on Semantic Issues in Multidatabases, A. Sheth, ed., 20(4), December 1991.
[HM85] D. Heimbigner and D. McLeod. A federated architecture for Information Systems. ACM Transactions on Office Information Systems, 3(3), 1985.
[Ken91] W. Kent. The breakdown of the Information Model in Multidatabase Systems. SIGMOD Record, special issue on Semantic Issues in Multidatabases, A. Sheth, ed., 20(4), December 1991.
[KLK91] R. Krishnamurthy, W. Litwin, and W. Kent. Language features for Interoperability of Databases with Schematic Discrepancies. In Proceedings of 1991 ACM SIGMOD, May 1991.
[KS91] W. Kim and J. Seo. Classifying Schematic and Data Heterogeneity in Multidatabase Systems. IEEE Computer, 24(12), December 1991.
[LA86] W. Litwin and A. Abdellatif. Multidatabase Interoperability. IEEE Computer, 19(12), December 1986.
[LNE89] J. Larson, S. Navathe, and R. Elmasri. A Theory of Attribute Equivalence in Databases with Application to Schema Integration. IEEE Transactions on Software Engineering, 15(4), 1989.
[ME84] M. Mannino and W. Effelsberg. Matching techniques in Global Schema Design. In Proceedings of the 1st IEEE Conference on Data Engineering, April 1984.
[PLS92] M. Papazoglou, S. Laufmann, and T. Sellis. An organizational framework for Cooperating Intelligent Information Systems. In International Journal of Intelligent and Cooperative Information Systems, March 1992.
[RSK91] M. Rusinkiewicz, A. Sheth, and G. Karabatis. Specifying Interdatabase Dependencies in a Multidatabase Environment. IEEE Computer, 24(12), December 1991.
[SG89] A. Sheth and S. Gala. Attribute relationships: An impediment in automating Schema Integration. In Proceedings of the NSF Workshop on Heterogeneous Databases, December 1989.
[She91a] A. Sheth. Federated Database Systems for managing Distributed, Heterogeneous, and Autonomous Databases. Tutorial Notes, the 17th VLDB Conference, September 1991.
[She91b] A. Sheth. Semantic issues in Multidatabase Systems. SIGMOD Record, special issue on Semantic Issues in Multidatabases, A. Sheth, ed., 20(4), December 1991.
[SL90] A. Sheth and J. Larson. Federated Database Systems for managing Distributed, Heterogeneous and Autonomous Databases. ACM Computing Surveys, 22(3), September 1990.
[SM91] M. Siegel and S. Madnick. A Metadata Approach to resolving Semantic Conflicts. In Proceedings of the 17th VLDB Conference, September 1991.
[SM92] A. Sheth and H. Marcus. Schema Analysis and Integration: Methodology, Techniques and Prototype Toolkit. Technical Report TM-STS-019981/1, Bellcore, March 1992.
[SRK92] A. Sheth, M. Rusinkiewicz, and G. Karabatis. Using Polytransactions to manage Independent Data. In Database Transaction Models, 1992.
[TCY93] F. Tseng, A. Chen, and W. Yang. Answering Heterogeneous Database Queries with Degrees of Uncertainty. Distributed and Parallel Databases, An International Journal, 1(3), July 1993.
[VH91] V. Ventrone and S. Heiler. Semantic Heterogeneity as a result of Domain Evolution. SIGMOD Record, special issue on Semantic Issues in Multidatabases, A. Sheth, ed., 20(4), December 1991.
[Woo85] J. Wood. What's in a link? In Readings in Knowledge Representation. Morgan Kaufmann, 1985.
[YSDK91] C. Yu, W. Sun, S. Dao, and D. Keirsey. Determining relationships among attributes for Interoperability of Multidatabase Systems. In Proceedings of the 1st International Workshop on Interoperability in Multidatabase Systems, April 1991.
[Zad78] L. Zadeh. Fuzzy Sets as a basis for a Theory of Possibility. Fuzzy Sets and Systems, 1(1), 1978.




Publication date: 1992